[inference_metadata_fields] Clear inference results on explicit nulls #119145
base: inference_metadata_fields
Conversation
Previously, setting a field explicitly to null in an update request did not work correctly with semantic text fields. This change fixes the issue by adding an explicit null entry to the `_inference_fields` metadata whenever that happens. The explicit null ensures that any prior inference results are overwritten when the partial update is merged with the latest document version.
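As a rough illustration of the merge semantics the fix relies on (a minimal sketch with hypothetical data and plain maps standing in for the document metadata, not the actual Elasticsearch merge code):

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ExplicitNullMergeSketch {
    public static void main(String[] args) {
        // Latest stored metadata: prior inference results for the field.
        Map<String, Object> inferenceFields = new HashMap<>();
        inferenceFields.put("sparse_field", List.of("chunk-1", "chunk-2"));

        // Partial update that sets the source field explicitly to null:
        // the fix records a matching explicit null under _inference_fields.
        Map<String, Object> partialUpdate = new HashMap<>();
        partialUpdate.put("sparse_field", null);

        // Merging the partial update over the latest document version lets
        // the explicit null overwrite the stale inference results.
        inferenceFields.putAll(partialUpdate);
        System.out.println(inferenceFields); // {sparse_field=null}
    }
}
```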
…ds_explicit_null_fixes
…on" test to use multiple source fields
...ava/org/elasticsearch/xpack/inference/action/filter/ShardBulkInferenceActionFilterTests.java
I refactored into two YAML tests so that we can focus on the explicit nulls in one of them
Nice work! Overall looks good.
```java
// ensure that the order in the original field is consistent in case of multiple inputs
Collections.sort(responses, Comparator.comparingInt(FieldInferenceResponse::inputOrder));
Map<String, List<SemanticTextField.Chunk>> chunkMap = new LinkedHashMap<>();
for (var resp : responses) {
    // Get the first non-null model from the response list
    if (model == null) {
```
Do we need to do any additional validation here, to verify that `model`, if it exists, is compatible with `resp.model`?
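For illustration, one possible shape such a check could take. This is a sketch with a hypothetical stand-in type: the real model accessors aren't shown in the hunk, so the names here are assumptions, not the actual `org.elasticsearch.inference.Model` API.

```java
public class ModelCompatibilitySketch {
    // Stand-in for the real model type; the field names are assumptions.
    record ModelInfo(String inferenceEntityId, String taskType) {}

    // Fail fast if a response's model disagrees with the first model seen.
    static void validateCompatible(ModelInfo first, ModelInfo fromResponse) {
        if (first != null
            && (first.inferenceEntityId().equals(fromResponse.inferenceEntityId()) == false
                || first.taskType().equals(fromResponse.taskType()) == false)) {
            throw new IllegalArgumentException(
                "inference responses for one field must use the same model"
            );
        }
    }
}
```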
```java
var valueObj = XContentMapValues.extractValue(sourceField, docMap, EXPLICIT_NULL);
if (useLegacyFormat == false && isUpdateRequest && valueObj == EXPLICIT_NULL) {
    /**
     * It's an update request, and the source field is explicitly set to null,
```
Nice comment here!
```java
     */
    var slot = ensureResponseAccumulatorSlot(itemIndex);
    slot.addOrUpdateResponse(
        new FieldInferenceResponse(field, sourceField, null, order++, 0, null, EMPTY_CHUNKED_INFERENCE)
```
We're incrementing `order` here, and we're still incrementing it later on, at line 566 in the existing code. Do we need to reset the value of `order` before we iterate through the values here? It's a bit confusing on a read-through.
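To make the concern concrete, here is a simplified, hypothetical illustration (not the PR's code) of how a single shared counter sequences both kinds of responses rather than each starting from zero:

```java
import java.util.List;

public class OrderCounterSketch {
    public static void main(String[] args) {
        int order = 0;
        // Explicit-null responses consume the first positions in the sequence.
        for (String nullField : List.of("field_a", "field_b")) {
            System.out.println(nullField + " -> order " + order++); // 0, 1
        }
        // The later loop over regular values continues the same counter.
        for (String value : List.of("value_1", "value_2")) {
            System.out.println(value + " -> order " + order++); // 2, 3
        }
    }
}
```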
```yaml
- match: { hits.total.value: 1 }
- match: { hits.total.relation: eq }

- length: { hits.hits.0._source._inference_fields.sparse_field.inference.chunks: 1 }
```
I know we check the offsets here but can we check the embeddings as well? (Verify the right chunk was removed, in a more visual manner?)
Pinging @elastic/search-eng (Team:SearchOrg)
Pinging @elastic/search-relevance (Team:Search - Relevance)
Fix a bug where setting a `semantic_text` source field explicitly to `null` in an update request to clear inference results did not actually clear the inference results for that field. This bug only affects the new `_inference_fields` format.